Cache memory device

ABSTRACT

A cache memory device including an input/output (ESRQ) for receiving a request (REQ) having a main address (AP) and optional data (D); an input/output (ESMP) to an addressable main memory (MP) or another addressable cache memory; a plurality of X memory banks (BCi), wherein i is lower than or equal to X and higher than 0, each having a number Li of lines for containing data, the lines being individually designated by a local address (AL) in each bank; an arrangement for answering a request (REQ) by connecting the main address (AP) in the request to a local address (AL) in the bank (BCi) in accordance with a predetermined law (fi) for each bank (BCi), whereby the line thus designated in the bank (BCi) is the only line of that bank able to contain the datum referred to by the main address; and an arrangement (CHA) for loading the cache memory according to the received requests. At least two predetermined laws (fi) are substantially distinct depending on the banks in question, and the two banks in question are addressed separately, whereby the average cache memory data access hit rate is improved.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates to the technological field of buffer or cache memories.

It finds a general application in data processing systems.

2. Background Information

Technological progress, in particular in the field of clock speeds and the integration of processors, tends to reduce increasingly the cycle times of the said processors and to permit the sequencing and execution of several instructions per cycle.

It follows therefrom that there is an increasingly heavy demand on the data flow in the main memory of a data processing system.

However, the technological progress has not made it possible to reduce the time of access to the data in the main memory at the same rate as the cycle times of the processors.

Indeed at present, the access time in the main memory is often of the order of several tens of processing cycles, even hundreds of processing cycles.

One known solution for masking the latency of the access to the data in the main memory lies in using cache memories (Computing Surveys, Vol. 14, No. 3, September 1982, pages 473-530, “Cache Memories”).

In general, a cache memory is a fast access memory, generally of a small size, wherein a part of the set of data stored in the main memory is stored.

In practice, when a processor makes a request comprising a main address in the main memory, and possibly data, the cache memory responds to the said request either by connecting the main address contained in this request and a data line of the cache memory, when the desired data item is present and valid in the cache memory, or by signalling that it is absent in the opposite case. In this latter case, the processor addresses the main memory for accessing the desired data item. The data line in the main memory containing the desired data item can then be loaded into the cache memory.

Several cache memory systems are known, in particular the direct recording system also called “direct mapped”, the wholly associative multibank system and the associative multibank set system (EP-A-0 334 479). These systems will be described in greater detail below.

It is clear that the use of cache memories shortens the time of access to the data in the main memory, thanks to the fact in particular that the cache memories are faster than the main memories.

Nevertheless, the effective performance of data processing systems using cache memories depends on the average hit rate during access to the data in the said cache memories.

Now, this average hit rate is not entirely satisfactory in the above mentioned systems of cache memories.

SUMMARY OF THE INVENTION

The object of the invention is precisely that of improving the average hit rate during access to the data in the cache memories.

The invention relates to a cache memory device using a multibank system.

In the known way, the cache memory device comprises:

at least one request input/output for receiving a request for access to a data item stored in the addressable main memory or in another addressable cache memory, comprising a main address, and possibly data;

at least one main memory input/output connected to the main addressable memory for accessing the said desired data item of the main memory;

a plurality of memory banks, each having a number of lines capable of containing data, these lines being capable of being individually designated by a local address in each bank;

computing means connected to the request input/output and capable of answering the request by connecting the main address contained in this request to a local internal address in each of the banks, the line thus designated in the bank being the only line of the said bank that is capable of containing the data labelled by the main address;

loading means connected to the main memory input/output for loading the data line of the main memory containing the desired data item into the cache memory when it is not present in the cache memory.

According to a general definition of the invention, the computing means establish the said relation between the main address and the local address in the bank in accordance with a predetermined law associated with the said bank; at least two of the predetermined laws are distinct according to the banks in question; and the two banks in question are addressed separately, according to their respective law.

BRIEF DESCRIPTION OF THE DRAWINGS

Other characteristics and advantages of the invention will emerge in the light of the following detailed description and drawings wherein:

FIG. 1 is a schematic view of a known data processing system;

FIG. 2 is a schematic view of a cache memory arranged in accordance with a known system termed “wholly associative”;

FIG. 3 is a schematic view of a cache memory arranged in accordance with a known system termed “direct mapped”;

FIG. 4 is a schematic view of a multibank cache memory arranged according to a known system termed “associative by set”;

FIG. 5 is a schematic view of a multibank cache memory arranged in accordance with a system that is associative by set, modified according to the invention; and

FIG. 6 is a schematic view of a law allowing the main address contained in a request to be connected to a local address in a bank in accordance with the invention.

DETAILED DESCRIPTION

In FIG. 1, the reference SI designates a data processing system using a cache memory AM.

The cache memory AM includes L lines individually represented at LO to LL each containing M words of B bits. For example, the cache memory AM comprises 64 lines of 8 words of 16 bits. The product L×B×M defines the bit size of the cache memory.

Data D and optionally tags T are stored in the lines of the cache memory.

The tags T serve for example to determine the main address AP in the main memory of the data stored in the line of the cache memory and to indicate the validity of the said data. It should be observed that it is the main address that is stored in the tag and which makes it possible to effect the connection between the data item in the cache memory and its address in the main memory.

When a processor PRO wishes to access a data item stored in the main memory MP, it first makes a request REQ comprising the main address AP of the said desired data in the main memory MP and optionally the data.

The request REQ is then received by means forming the input/output ESRQ connected to means CAL that are capable of answering the request by connecting the main address AP contained in this request and local addresses AL of the cache memory AM in accordance with laws that are predetermined for the cache memory.

Loading means CHA load the cache memory according to the received requests.

When the desired data item is present in the data line L of the cache memory labelled by the local address AL, the processor accesses the desired data item.

In the opposite case, the processor addresses the main memory via the input/output means ESRQ for accessing the desired data item in the main memory.

The data line in the main memory containing the desired data can then be loaded into the cache memory in accordance with predetermined laws.

In the known way, the system of the cache memory differs according to its associativeness.

In FIG. 2, a cache memory has been represented arranged according to a system termed wholly associative.

In such a system, the loading means CHA load the data lines of the main memory into any line of the cache memory, and this irrespective of the main address of the data line in the main memory.

Such a system necessitates mechanisms for access to the cache memory of a considerable size and entails a prohibitive access time when the number of lines of the cache memory is large, since it is necessary to read the presence tag T of all the lines of the cache memory, and to compare the main address AP with the local address AL of the data line of the cache memory, this main address, as well as the information concerning the validity of the line, being stored in a tag that is associated with the data line.

In FIG. 3, a cache memory has been represented that is arranged according to the system termed “direct mapped”.

In such a system, the loading means CHA load or “map” the data lines in the main memory into lines of the cache memory whose respective local address AL is directly derived from the main address AP, most frequently by taking the significant bits of a lower weighting.
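By way of illustration only, the derivation of the local address from the low-order bits may be sketched as follows in Python. The function names and the geometry (64 lines of 16 octets, following the example given with reference to FIG. 1) are assumptions made for this sketch and are not taken from the drawings.

```python
LINE_OCTETS = 16  # assumed: 8 words of 16 bits per line
NUM_LINES = 64    # assumed: 64 lines in the cache memory

def direct_mapped_index(main_address: int) -> int:
    """Local address AL: the low-order bits of AP above the in-line offset."""
    return (main_address // LINE_OCTETS) % NUM_LINES

def tag_of(main_address: int) -> int:
    """Remaining high-order bits of AP, kept in the tag T to identify the line."""
    return main_address // (LINE_OCTETS * NUM_LINES)

# Main addresses 0 and 1024 designate the same cache line and differ only by their tag.
same_line = direct_mapped_index(0) == direct_mapped_index(1024)
```

Two main addresses whose low-order index bits coincide are candidates for the same cache line, which is the source of the conflicts discussed in the text that follows.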

The system termed “direct mapped” is relatively simple. Indeed, starting from a data line in the cache memory, a single word and its associated presence tag T are read first of all. Subsequently, the local address AL of the line tag thus read is compared with the main address AP to be loaded. In the case of a positive comparison, the data line in the main memory is loaded into the cache memory line thus labelled.

But during the running of a program, it is possible that several lines in the main memory may have to be mapped to the same line of the cache memory and therefore enter into conflict, which produces setbacks during the operations of accessing the cache memory.

It follows therefrom that such a system has the drawback of having a lower hit rate during access to the data in the cache memory than the preceding system.

In FIG. 4, a possible arrangement of a cache memory according to the multibank system termed associative per set has been represented.

In such a system, the cache memory AM is subdivided into X banks BCi with i being less than or equal to X and greater than 0, each having a number Li of lines that are capable of containing data D.

Here the number of lines Li is the same in all the banks. In a variant, it could be different according to the banks in question.

The banks BCi have ultrafast accessibility. They are made, for example, in a static RAM technology with an access time of the order of 6 to 12×10⁻⁹ seconds.

These lines Li can be individually designated by a local address ALi.

For example, the cache memory is subdivided into two banks individually represented at BC1 and BC2, each having 8 lines individually represented at L10 to L17 for BC1 and L20 to L27 for BC2.

In practice, the means CAL respond to a request REQ containing a main address, and possibly data, by connecting the main address AP contained in this request, and the same local address AL in each of the banks BCi according to a predetermined law f, the line thus designated in the bank BCi being the only line of the said bank BCi that is capable of containing the data item labelled by the main address AP.

In other words, a data line in the main memory may be mapped in any of the lines of the set constituted by the lines of the local address AL in the banks BCi. The local address AL is determined by the main address AP; most frequently, the local address is directly derived from the bits of the lowest weighting of the main address AP.

However, such a system is not entirely satisfactory, inasmuch as the addressing of the banks is effected jointly according to the same predetermined law f. In other words, the data lines in the main memory are loaded into one or the other of the banks, and this at the same local address in each bank.

It follows therefrom that with such a system, that is to say, with joint addressing of the banks, the average hit rate during access to the data in the cache memory may sometimes be relatively low.

For example, when (x+1) data lines in the main memory relating to the same application have to be mapped in the set constituted by lines of the same local address AL, the (x+1) data lines cannot be present together in the cache memory, which introduces conflicts.
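The conflict described above can be reproduced with a small simulation, given here as a sketch only. It assumes two banks of 8 lines, a law f that takes the line number modulo 8, and a least-recently-used replacement policy; the replacement policy and all names are assumptions of this sketch, since the text does not specify them.

```python
from collections import OrderedDict

def simulate_set_associative(line_numbers, num_sets=8, ways=2):
    """Count hits when every bank is addressed with the same local address."""
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for line in line_numbers:
        current = sets[line % num_sets]  # joint addressing: one law f for all banks
        if line in current:
            hits += 1
            current.move_to_end(line)        # mark as most recently used
        else:
            if len(current) >= ways:
                current.popitem(last=False)  # evict the least recently used line
            current[line] = True
    return hits

# Three data lines sharing local address 0, accessed in turn ten times.
trace = [0, 8, 16] * 10
hits_two_banks = simulate_set_associative(trace, ways=2)
hits_three_banks = simulate_set_associative(trace, ways=3)
```

With two banks the three lines evict one another on every access, whereas a third bank would let them coexist after the initial loads.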

The Applicant has set himself the task of providing a solution to this problem.

The solution brought to this problem in accordance with the invention lies in introducing into a multibank system of the cache memory the use of local addressing functions that are distinct for the memory banks, and therefore a separate addressing system of the banks.

Reference will now be made to FIG. 5 which schematically represents a cache memory arranged according to an associative system modified in accordance with the invention.

The cache memory AM subdivided into two banks BC1 and BC2 having respectively L1 and L2 lines will again be found; the lines contain data D and are individually designated by a local address AL.

For the bank BC1, the computing means CAL1 responds to a request by connecting the main address AP contained in the request to a local address AL1 in the bank BC1 according to a predetermined law f1, the line thus designated in the bank BC1 being the only line of the said bank BC1 that is capable of containing the data item labelled by the main address AP.

Similarly, for the bank BC2, the computing means CAL2 responds to a request by connecting the main address AP contained in this request to a local address AL2 in the bank BC2 according to a predetermined law f2, the line thus designated in the bank BC2 being the only line of the said bank BC2 that is capable of containing the data item labelled by the main address AP.

Surprisingly, the Applicant has found that by replacing the joint addressing of the banks described with reference to FIG. 4 by separate addressing and by making the two laws distinct according to the two banks in question, the hit rate during access to the data in the cache memory is improved.

Indeed, with a separate addressing of the banks, the data lines in the main memory are now loaded into one or the other of the banks, and this to local addresses that may differ from one bank to the other.

Thus when (x+1) data lines in the main memory come into conflict for being mapped in the same line of the cache memory in the bank BC1, it is possible that they will not come into conflict in the other banks BCj of the cache memory and may thus be present at the same time in the cache memory, which makes it possible to avoid certain setbacks during access to the cache memory.
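The effect of separate addressing can be sketched with two banks of 8 lines. The laws below are hypothetical and chosen only for brevity (they are not the laws of the Annex), and the placement policy (use an empty candidate line if there is one, otherwise overwrite the candidate in the first bank) is likewise an assumption of this sketch.

```python
NUM_LINES = 8  # lines per bank, as in the two-bank example

def law_low(line_number):
    """Hypothetical law: low-order bits of the line number."""
    return line_number % NUM_LINES

def law_xor(line_number):
    """Hypothetical second law: low-order bits XOR the next group of bits."""
    return (line_number % NUM_LINES) ^ ((line_number // NUM_LINES) % NUM_LINES)

def simulate_banks(line_numbers, laws):
    """Count hits when bank b is addressed with its own law laws[b]."""
    banks = [[None] * NUM_LINES for _ in laws]
    hits = 0
    for line in line_numbers:
        slots = [law(line) for law in laws]  # one candidate line per bank
        if any(banks[b][s] == line for b, s in enumerate(slots)):
            hits += 1
            continue
        # Assumed policy: fill an empty candidate, else overwrite bank 0.
        target = next((b for b, s in enumerate(slots) if banks[b][s] is None), 0)
        banks[target][slots[target]] = line
    return hits

trace = [0, 8, 16] * 10
hits_joint = simulate_banks(trace, [law_low, law_low])     # same law in both banks
hits_separate = simulate_banks(trace, [law_low, law_xor])  # distinct laws
```

The three lines collide at local address 0 under `law_low`, but `law_xor` sends them to lines 0, 1 and 2 of the second bank, so all three can be present at once.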

To permit separate addressing of the banks, it is necessary to differentiate the laws connecting the main address contained in a request and a local address in the bank in question.

To obtain a better hit rate in accessing the cache memory, it is necessary to choose the laws fi carefully.

The Applicant has found first of all that the laws fi must be equitable.

A law fi connecting the main address of a data line to the local address in the bank BCi is said to be equitable if, for each line of the bank BCi, the number of lines of data D that can be mapped in the said line is unique and equal to that of all the lines of the bank BCi.
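The definition of an equitable law lends itself to a direct check by enumeration. The following is a minimal sketch, assuming that laws are given as Python functions over line numbers; the helper name and the two sample laws are hypothetical.

```python
from collections import Counter

def is_equitable(law, num_lines, line_numbers):
    """True if every line of the bank receives the same number of data lines."""
    counts = Counter(law(a) for a in line_numbers)
    every_line_used = len(counts) == num_lines
    same_count_everywhere = len(set(counts.values())) == 1
    return every_line_used and same_count_everywhere

modulo_is_equitable = is_equitable(lambda a: a % 8, 8, range(64))
square_is_equitable = is_equitable(lambda a: (a * a) % 8, 8, range(64))
```

Taking the line number modulo 8 spreads 64 data lines evenly over 8 lines, whereas squaring first reaches only lines 0, 1 and 4 and is therefore not equitable.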

Subsequently, the Applicant has found that the laws fi must be dispersive relative to one another.

A law fi is said to be a law of dispersion relative to the law fj if the law fj restricted to the set of lines that can be mapped in a predetermined line of the bank BCi is equitable.
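The dispersion property can be checked in the same enumerative manner. The sketch below groups the data lines by the line of bank BCi they map to and then verifies that the second law is equitable on each group; the function names and the two sample laws are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def is_dispersive(law_i, law_j, num_lines, line_numbers):
    """True if law_j is equitable on each set of lines that law_i maps together."""
    groups = defaultdict(list)
    for a in line_numbers:
        groups[law_i(a)].append(a)
    for members in groups.values():
        counts = Counter(law_j(a) for a in members)
        if len(counts) != num_lines or len(set(counts.values())) != 1:
            return False
    return True

def low(a):
    return a % 8

def skew(a):
    return (a % 8) ^ ((a // 8) % 8)

low_vs_skew = is_dispersive(low, skew, 8, range(64))
low_vs_low = is_dispersive(low, low, 8, range(64))
```

A law is never dispersive relative to itself, since all the lines it groups together fall on a single line again; this is the situation of joint addressing.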

Finally, the Applicant has found that the laws fi must not have any spatial locality.

Indeed, many applications have a spatial locality, that is to say, that in these applications the data used in a short lapse of time have main addresses that are relatively close to one another.

Now, to prevent any conflicts from arising, it is desirable to choose laws fi which make it possible to prevent two lines whose main addresses would be close to one another (that is to say, being almost consecutive) from being mapped in the same line of the bank BCi.

We will now describe an example of a group of laws fi applied to a cache memory constituted by four banks of 2^(n) lines of 2^(c) octets each. It will be assumed that the main memory has 2^(q) octets where q≧2×n+c.

Let us consider the binary representation of a main address AP in four strings of bits AP=(A3, A2, A1, A0) where A0 is a string of c bits representing the displacement in the lines, where A1 and A2 are two strings of n bits and where A3 is the string of the most significant q−(2×n+c) bits.

If (y_(n), y_(n−1), . . . , y₁) is the binary representation of y=Σ_(i=1, . . . , n) y_(i) 2^(i−1), let us consider the function H defined by formula I in the Annex and the four laws fi defined by the formulae II to V in the Annex.

The expert will understand that the laws f1 to f4 are equitable.

Moreover, for each pair (i, j) in {1,2,3,4}, the law fi is a law of dispersion relative to fj for values of n=3,4,6,7,9,10,12,13,15 and 16.

Finally, the local dispersion of data in a single bank is virtually optimal; whatever the cache memory line in question, in a set of K×2^(n) data lines of consecutive addresses, there are at most K+1 lines that can be mapped in the said line.

It should be observed that the establishment of the laws described above is simple.

Indeed, each bit of fi(AP) is obtained by the EXCLUSIVE OR of at most 4 bits of the binary decomposition of the main address AP.

Moreover, the hardware necessary for establishing the laws fi is the same whatever the law: it is a matter of computing H(x)⊕H⁻¹(y)⊕z where x, y, z are strings of n bits.
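The common form H(x)⊕H⁻¹(y)⊕z can be sketched in software as follows. This sketch assumes that Formula I reads (y_(n), . . . , y₁) → (y_(n)⊕y₁, y_(n), y_(n−1), . . . , y₂), which is consistent with the bound of at most 4 bits per output bit stated above; the function names, the argument order and the requirement n≥2 are assumptions of the sketch.

```python
def H(y, n):
    """(y_n, ..., y_1) -> (y_n XOR y_1, y_n, y_(n-1), ..., y_2); assumes n >= 2."""
    y_n = (y >> (n - 1)) & 1
    y_1 = y & 1
    return ((y_n ^ y_1) << (n - 1)) | (y >> 1)

def H_inv(t, n):
    """Inverse of H: (t_n, ..., t_1) -> (t_(n-1), ..., t_1, t_n XOR t_(n-1))."""
    mask = (1 << n) - 1
    t_n = (t >> (n - 1)) & 1
    t_n1 = (t >> (n - 2)) & 1
    return ((t << 1) & mask) | (t_n ^ t_n1)

def split_address(ap, n, c):
    """Split AP into (A3, A2, A1, A0) with A0 of c bits, A1 and A2 of n bits."""
    mask = (1 << n) - 1
    a0 = ap & ((1 << c) - 1)
    a1 = (ap >> c) & mask
    a2 = (ap >> (c + n)) & mask
    a3 = ap >> (c + 2 * n)
    return a3, a2, a1, a0

def skew_laws(ap, n, c):
    """Local addresses (f1, f2, f3, f4) of AP following formulae II to V."""
    _, a2, a1, _ = split_address(ap, n, c)
    f1 = H(a1, n) ^ H_inv(a2, n) ^ a2
    f2 = H(a1, n) ^ H_inv(a2, n) ^ a1
    f3 = H_inv(a1, n) ^ H(a2, n) ^ a2
    f4 = H_inv(a1, n) ^ H(a2, n) ^ a1
    return f1, f2, f3, f4

# With n=4 and c=2, main address 4 has A1=1 and A2=0.
example = skew_laws(4, 4, 2)
```

The four laws differ only in which of H or H⁻¹ is applied to A1 and A2 and in which string is used as the third operand, which is why one XOR network suffices for all of them.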

A mechanism for establishing this law in accordance with the invention is represented in FIG. 6.

To compute H(x)⊕H⁻¹(y)⊕z where x, y, z are strings of 6 bits individually represented at x6 to x1; y6 to y1 and z6 to z1, 6 XOR gates individually represented at P1 to P6 are used.

Each gate P has 3 or 4 inputs, each receiving one of the bits of the strings x, y or z and an output delivering a bit t.

As shown in FIG. 6, one input of the gate P1 and one input of the gate P2 receive, for example, the bit x6.

The setting up of the inputs of the XOR gates represented in FIG. 6 forms one example of the embodiment of the invention.
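One possible wiring of such a network is sketched below, one XOR per output bit. The bit equations are derived from Formula I as read in the form (y_(n)⊕y₁, y_(n), . . . , y₂) and are not taken from FIG. 6; in particular, the numbering of the output bits here is an assumption and need not coincide with the gate numbering P1 to P6 of the figure.

```python
def bit(value, k):
    """Bit k of value, with k=1 the least significant bit."""
    return (value >> (k - 1)) & 1

def xor_network(x, y, z, n=6):
    """Compute H(x) XOR H_inv(y) XOR z with one 3- or 4-input XOR per output bit."""
    t = 0
    for k in range(1, n + 1):
        if k == n:
            # 4 inputs: x_n, x_1, y_(n-1), z_n
            b = bit(x, n) ^ bit(x, 1) ^ bit(y, n - 1) ^ bit(z, n)
        elif k == 1:
            # 4 inputs: x_2, y_n, y_(n-1), z_1
            b = bit(x, 2) ^ bit(y, n) ^ bit(y, n - 1) ^ bit(z, 1)
        else:
            # 3 inputs: x_(k+1), y_(k-1), z_k
            b = bit(x, k + 1) ^ bit(y, k - 1) ^ bit(z, k)
        t |= b << (k - 1)
    return t

# Setting only x6 drives two output bits, so x6 feeds two gates.
outputs_driven_by_x6 = xor_network(0b100000, 0, 0)
```

Under this derivation the bit x6 is an input of exactly two gates, and two of the six gates have 4 inputs while the others have 3.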

Of course, other set-ups in accordance with the invention make it possible to satisfy the properties of the above mentioned laws.

The Applicant has found that for cache memories of equal sizes, the behaviour of a cache memory with two banks arranged in accordance with the invention has a distinctly higher hit rate than that of a two-bank cache memory that is associative per set and is approximately comparable to that of a four-bank cache memory that is associative per set. The behaviour of a four-bank cache memory arranged in accordance with the invention has a hit rate that is higher than that of a four-bank cache memory that is associative per set and is approximately comparable to that of an eight-bank cache memory that is associative per set.

ANNEX

Formula I

H: {0, . . . , 2^(n)−1} → {0, . . . , 2^(n)−1}

(y_(n), y_(n−1), . . . , y₁) → (y_(n)⊕y₁, y_(n), y_(n−1), . . . , y₃, y₂)

where ⊕ is the EXCLUSIVE OR operation (XOR)

Formula II

f₁: S → {0, . . . , 2^(n)−1}

(A₃, A₂, A₁, A₀) → H(A₁)⊕H⁻¹(A₂)⊕A₂

Formula III

f₂: S → {0, . . . , 2^(n)−1}

(A₃, A₂, A₁, A₀) → H(A₁)⊕H⁻¹(A₂)⊕A₁

Formula IV

f₃: S → {0, . . . , 2^(n)−1}

(A₃, A₂, A₁, A₀) → H⁻¹(A₁)⊕H(A₂)⊕A₂

Formula V

f₄: S → {0, . . . , 2^(n)−1}

(A₃, A₂, A₁, A₀) → H⁻¹(A₁)⊕H(A₂)⊕A₁.

What is claimed is:
 1. An improved cache memory device for use in a data processing system which includes an addressable main memory (MP); at least one request input/output (ESRQ) for receiving a request (REQ) for access to a data item stored in the addressable main memory (MP) or in the cache memory device, the request (REQ) including a main address (AP) of the desired data item; at least one main memory input/output (ESMP) connected to the main addressable memory (MP) for accessing the desired data item of the main memory; a plurality of X memory banks (BCi) with i being less than or equal to X and greater than 0, each having a number Li of lines capable of containing data, these lines being capable of being individually designated by a local address (ALi) in each bank (BCi); computing means (CAL) connected to the request input/output (ESRQ) and capable of answering the request (REQ) by transforming the main address (AP) contained in this request to a local address (AL) inside each of the banks (BCi), the line thus designated in the bank (BCi) being the only line of the said bank that is capable of containing the data labelled by the main address; and loading means (CHA) connected to the main memory input/output (ESMP) for loading the data line of the main memory containing the desired data item into the cache memory device if it is not present in the cache memory device, wherein the improvement comprises: the computing means (CAL) comprises means for transforming the main address (AP) into a first local address in a first one of the memory banks in accordance with a first predetermined law associated with the first one of the memory banks, and for transforming the main address (AP) into a second local address in a second one of the memory banks in accordance with a second predetermined law which is associated with the second one of the memory banks, the first and second predetermined laws are distinct, and the first and second memory banks are addressed separately, according to their respective law.
 2. A device according to claim 1, wherein laws for transforming the main address (AP) into local addresses are different for all the memory banks.
 3. A device according to claim 1, wherein the first and second predetermined laws are equitable laws.
 4. A device according to claim 1, wherein the first and second predetermined laws are dispersive laws.
 5. A device according to claim 1, wherein the first and second predetermined laws are laws which do not have any spatial locality. 